826 research outputs found

    Simultaneous image transformation and sparse representation recovery

    Sparse representation in compressive sensing is gaining increasing attention due to its success in various applications. As we demonstrate in this paper, however, image sparse representation is sensitive to image plane transformations, such that existing approaches cannot reconstruct the sparse representation of a geometrically transformed image. We introduce a simple technique for obtaining transformation-invariant image sparse representation. It is rooted in two observations: 1) if the aligned model images of an object span a linear subspace, their transformed versions with respect to some group of transformations can still span a linear subspace in a higher dimension; 2) if a target (or test) image, aligned with the model images, lives in the above subspace, its pre-alignment versions get closer to the subspace after applying estimated transformations with increasingly accurate parameters. These observations motivate us to project a potentially unaligned target image to random projection manifolds defined by the model images and the transformation model. Each projection is then separated into the aligned projection target and a residue due to misalignment. The desired aligned projection target is then iteratively optimized by gradually diminishing the residue. In this framework, we can simultaneously recover the sparse representation of a target image and the image plane transformation between the target and the model images. We have applied the proposed methodology to two applications: face recognition and dynamic texture registration. The improved performance over previous methods demonstrates the effectiveness of the proposed approach.

    AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

    In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation. With a novel attentional generative network, the AttnGAN can synthesize fine-grained details at different subregions of the image by paying attention to the relevant words in the natural language description. In addition, a deep attentional multimodal similarity model is proposed to compute a fine-grained image-text matching loss for training the generator. The proposed AttnGAN significantly outperforms the previous state of the art, boosting the best reported inception score by 14.14% on the CUB dataset and 170.25% on the more challenging COCO dataset. A detailed analysis is also performed by visualizing the attention layers of the AttnGAN. It shows for the first time that the layered attentional GAN is able to automatically select the condition at the word level for generating different parts of the image.

    Token Imbalance Adaptation for Radiology Report Generation

    Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. The token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but reflect more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) of radiology report generation. However, no prior study has proposed methods to adapt infrequent tokens for text generators fed with medical images. To solve the challenge, we propose the Token Imbalance Adapter (TIMER), aiming to improve generation robustness on infrequent tokens. The model automatically leverages token imbalance by an unlikelihood loss and dynamically optimizes generation processes to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting token imbalance for radiology report generation. Comment: Accepted by CHIL202

    Enriching Unsupervised User Embedding via Medical Concepts

    Clinical notes in Electronic Health Records (EHR) present rich documented information about patients, which can be used to infer phenotypes for disease diagnosis and to study patient characteristics for cohort selection. Unsupervised user embedding aims to encode patients into fixed-length vectors without human supervision. Medical concepts extracted from the clinical notes contain rich connections between patients and their clinical categories. However, existing unsupervised approaches to user embedding from clinical notes do not explicitly incorporate medical concepts. In this study, we propose a concept-aware unsupervised user embedding that jointly leverages text documents and medical concepts from two clinical corpora, MIMIC-III and Diabetes. We evaluate user embeddings on both extrinsic and intrinsic tasks, including phenotype classification, in-hospital mortality prediction, patient retrieval, and patient relatedness. Experiments on the two clinical corpora show that our approach exceeds unsupervised baselines, and that incorporating medical concepts can significantly improve the baseline performance. Comment: accepted at ACM CHIL 2022. a revision for section reforma

    Research on Safety Investment Decision Evaluation and Optimization of Network Booking Taxi Platform Enterprise based on Subjective-Objective Assessment Method

    This study addresses the current disproportion between investment in, and return from, the safe operation of Network Booking Taxi Platform Enterprises (NBTPE). It selects representative NBTPEs in the domestic travel field and applies the System Dynamics method to analyze the impact of internal and external safety inputs, producing a graph of the safety input law. Combining a subjective weighting method, represented by the analytic hierarchy process, with an objective weighting method, represented by the entropy weight method, the study proposes determining a reasonable proportion for each safety input cost through a comprehensive Subjective-Objective Assessment Method, and evaluates the feasibility and reasonableness of the method by using linear regularization. The study further concludes that enterprises need to increase safety investment in equipment and facilities; it also measures the proportion of investment in different links and makes suggestions for optimizing the current proportion of safety investment in NBTPE. This study provides support for optimizing the safety investment ratio of platform companies and improving the efficiency of safety management.
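
As a rough illustration of the objective half of such a weighting scheme, the following is a minimal sketch of the standard entropy weight method, plus one common way of merging its output with subjective (for example AHP-derived) weights. The function names, the sample data, and the multiplicative combination rule are assumptions for illustration only; the abstract does not state how the study combines the two sets of weights.

```python
import math


def entropy_weights(matrix):
    """Objective indicator weights via the entropy weight method.

    matrix: list of rows (samples), each a list of positive indicator
    values; columns are indicators. Assumes at least two rows, positive
    column sums, and at least one column that is not perfectly uniform.
    """
    n = len(matrix)
    m = len(matrix[0])
    diversities = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Share of each sample within indicator j.
        shares = [x / total for x in column]
        # Shannon entropy scaled to [0, 1]; terms with p == 0 contribute 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Indicators that vary more across samples carry more information.
        diversities.append(1.0 - entropy)
    total_diversity = sum(diversities)
    return [d / total_diversity for d in diversities]


def combine_weights(subjective, objective):
    """Hypothetical combination rule: normalized element-wise product."""
    products = [s * o for s, o in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]


# Made-up data: 3 samples scored on 3 safety-input indicators.
sample_matrix = [
    [4.0, 10.0, 2.0],
    [5.0, 10.0, 8.0],
    [6.0, 10.5, 5.0],
]
objective = entropy_weights(sample_matrix)
subjective = [0.5, 0.3, 0.2]  # placeholder AHP-style weights
combined = combine_weights(subjective, objective)
```

The second indicator barely varies across samples, so it receives the smallest objective weight; the combined weights then temper that data-driven ranking with the subjective judgment.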

    What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

    Building named entity recognition (NER) models for languages that do not have much training data is a challenging task. While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred. In this paper, we first propose a simple and efficient neural architecture for cross-lingual NER. Experiments show that our model achieves performance competitive with the state of the art. We further analyze how transfer learning works for cross-lingual NER on two transferable factors, sequential order and multilingual embeddings, and investigate how model performance varies across entity lengths. Finally, we conduct a case study on a non-Latin language, Bengali, which suggests that leveraging knowledge from Wikipedia is a promising direction for further improving model performance. Our results can shed light on future research for improving cross-lingual NER. Comment: 7 page

    Hierarchical Passenger Hub Location Problem in a Megaregion Area Considering Service Availability

    The rapid growth of intercity travel demand has resulted in enormous pressure on the passenger transportation network in a megaregion area. Optimally locating hubs and allocating demands to hubs influence the effectiveness of a passenger transportation network. This study develops a hierarchical passenger hub location model considering the service availability of hierarchical hubs. A mixed integer linear programming formulation was developed to minimize the total cost of hub operation and transportation for multiple travel demands and to determine the proportion of passengers that access hubs at each level. This model was implemented for the Wuhan metropolitan area in four different scenarios to illustrate its applicability. Then, a sensitivity analysis was performed to assess the impact of changing key parameters on the model results. The results are compared to those of traditional models, and the findings demonstrate the importance of considering hub choice behavior in demand allocation.